AITopics | imputation function

Collaborating Authors

imputation function

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Missing At Random as Covariate Shift: Correcting Bias in Iterative Imputation

Shannon, Luke, Liu, Song, Reluga, Katarzyna

arXiv.org Machine LearningFeb-9-2026

Accurate imputation of missing data is critical to downstream machine learning performance. We formulate missing data imputation as a risk minimisation problem, which highlights a covariate shift between the observed and unobserved data distributions. This covariate shift induced bias is not accounted for by popular imputation methods and leads to suboptimal performance. In this paper, we derive theoretically valid importance weights that correct for the induced distributional bias. Furthermore, we propose a novel imputation algorithm that jointly estimates both the importance weights and imputation models, enabling bias correction throughout the imputation process. Empirical results across benchmark datasets show reductions in root mean squared error and Wasserstein distance of up to 7% and 20%, respectively, compared to otherwise identical unweighted methods.

artificial intelligence, imputation, machine learning, (18 more...)

arXiv.org Machine Learning

2602.06713

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Bristol (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Weighted Conformal Prediction Provides Adaptive and Valid Mask-Conditional Coverage for General Missing Data Mechanisms

Fan, Jiarong, Vo, Juhyun Park. Thi Phuong Thuy, Brunel, Nicolas

arXiv.org Machine LearningDec-17-2025

Conformal prediction (CP) offers a principled framework for uncertainty quantification, but it fails to guarantee coverage when faced with missing covariates. In addressing the heterogeneity induced by various missing patterns, Mask-Conditional Valid (MCV) Coverage has emerged as a more desirable property than Marginal Coverage. In this work, we adapt split CP to handle missing values by proposing a preimpute-mask-then-correct framework that can offer valid coverage. We show that our method provides guaranteed Marginal Coverage and Mask-Conditional Validity for general missing data mechanisms. A key component of our approach is a reweighted conformal prediction procedure that corrects the prediction sets after distributional imputation (multiple imputation) of the calibration dataset, making our method compatible with standard imputation pipelines. We derive two algorithms, and we show that they are approximately marginally valid and MCV. We evaluate them on synthetic and real-world datasets. It reduces significantly the width of prediction intervals w.r.t standard MCV methods, while maintaining the target guarantees.

dataset, imputation, prediction, (15 more...)

arXiv.org Machine Learning

2512.14221

Country:

North America > United States (0.14)
Europe > France > Île-de-France > Paris > Paris (0.14)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Data Science > Data Quality (0.85)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

5fe8fdc79ce292c39c5f209d734b7206-Supplemental.pdf

Neural Information Processing SystemsAug-14-2025, 19:01:45 GMT

bayes predictor, imputation function, mis, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.47)

Add feedback

5fe8fdc79ce292c39c5f209d734b7206-Paper.pdf

Neural Information Processing SystemsAug-14-2025, 19:01:43 GMT

bayes predictor, imputation, imputation function, (14 more...)

Neural Information Processing Systems

Country: Europe > France > Occitanie > Hérault > Montpellier (0.04)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Modeling & Simulation (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

Add feedback

A Unifying Framework for Robust and Efficient Inference with Unstructured Data

Carlson, Jacob, Dell, Melissa

arXiv.org Artificial IntelligenceJul-10-2025

This paper presents a general framework for conducting efficient inference on parameters derived from unstructured data, which include text, images, audio, and video. Economists have long used unstructured data by first extracting low-dimensional structured features (e.g., the topic or sentiment of a text), since the raw data are too high-dimensional and uninterpretable to include directly in empirical analyses. The rise of deep neural networks has accelerated this practice by greatly reducing the costs of extracting structured data at scale, but neural networks do not make generically unbiased predictions. This potentially propagates bias to the downstream estimators that incorporate imputed structured data, and the availability of different off-the-shelf neural networks with different biases moreover raises p-hacking concerns. To address these challenges, we reframe inference with unstructured data as a problem of missing structured data, where structured variables are imputed from high-dimensional unstructured inputs. This perspective allows us to apply classic results from semiparametric inference, leading to estimators that are valid, efficient, and robust. We formalize this approach with MAR-S, a framework that unifies and extends existing methods for debiased inference using machine learning predictions, connecting them to familiar problems such as causal inference. Within this framework, we develop robust and efficient estimators for both descriptive and causal estimands and address challenges like inference with aggregated and transformed missing structured data-a common scenario that is not covered by existing work. These methods-and the accompanying implementation package-provide economists with accessible tools for constructing unbiased estimators using unstructured data in a wide range of applications, as we demonstrate by re-analyzing several influential studies.

artificial intelligence, estimator, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2505.00282

Country:

North America > United States > Minnesota (0.27)
North America > United States > Massachusetts (0.27)

Genre: Research Report (1.00)

Industry:

Government (0.67)
Banking & Finance > Economy (0.46)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

What's a good imputation to predict with missing values?

Morvan, Marine Le, Josse, Julie, Scornet, Erwan, Varoquaux, Gaël

arXiv.org Machine LearningJun-1-2021

How to learn a good predictor on data with missing values? Most efforts focus on first imputing as well as possible and second learning on the completed data to predict the outcome. Yet, this widespread practice has no theoretical grounding. Here we show that for almost all imputation functions, an impute-then-regress procedure with a powerful learner is Bayes optimal. This result holds for all missing-values mechanisms, in contrast with the classic statistical results that require missing-at-random settings to use imputation in probabilistic modeling. Moreover, it implies that perfect conditional imputation may not be needed for good prediction asymptotically. In fact, we show that on perfectly imputed data the best regression function will generally be discontinuous, which makes it hard to learn. Crafting instead the imputation so as to leave the regression function unchanged simply shifts the problem to learning discontinuous imputations. Rather, we suggest that it is easier to learn imputation and regression jointly. We propose such a procedure, adapting NeuMiss, a neural network capturing the conditional links across observed and unobserved variables whatever the missing-value pattern. Experiments confirm that joint imputation and regression through NeuMiss is better than various two step procedures in our experiments with finite number of samples.

bayes predictor, imputation, mis, (15 more...)

arXiv.org Machine Learning

2106.00311

Country: Europe > France > Occitanie > Hérault > Montpellier (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

Add feedback

missIWAE: Deep Generative Modelling and Imputation of Incomplete Data

Mattei, Pierre-Alexandre, Frellsen, Jes

arXiv.org Machine LearningDec-6-2018

We present a simple technique to train deep latent variable models (DLVMs) when the training set contains missing data. Our approach is based on the importance-weighted autoencoder (IWAE) of Burda et al. (2016), and also allows single or multiple imputation of the incomplete data set. We illustrate it by training a convolutional DLVM on a static binarisation of MNIST that contains 50% of missing data. Leveraging mutiple imputations, we train a convolutional network that classifies these incomplete digits as well as complete ones.

artificial intelligence, imputation, machine learning, (14 more...)

arXiv.org Machine Learning

1812.02633

Country:

North America > United States (0.14)
North America > Canada (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback